Probability issues in without replacement sampling
Similar resources
Probability Inequalities for Kernel Embeddings in Sampling without Replacement
The kernel embedding of distributions is a popular machine learning technique to manipulate probability distributions and is an integral part of numerous applications. Its empirical counterpart is an estimate from a finite set of samples from the distribution under consideration. However, for large-scale learning problems the empirical kernel embedding becomes infeasible to compute and approxim...
On the inclusion probabilities in some unequal probability sampling plans without replacement
Comparison results are obtained for the inclusion probabilities in some unequal probability sampling plans without replacement. For either successive sampling or Hájek’s rejective sampling, the larger the sample size, the more uniform the inclusion probabilities in the sense of majorization. In particular, the inclusion probabilities are more uniform than the drawing probabilities. For the same...
Without-Replacement Sampling for Stochastic Gradient Methods
Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In contrast, sampling without replacement is far less understood, yet in practice it is very common, often easier to implement, and usually performs better. In this paper, we provide competitive convergence guarantees for without-replacement sampling...
Weighted Sampling Without Replacement from Data Streams
Weighted sampling without replacement has proved to be a very important tool in designing new algorithms. Efraimidis and Spirakis (IPL 2006) presented an algorithm for weighted sampling without replacement from data streams. Their algorithm works under the assumption of precise computations over the interval [0, 1]. Cohen and Kaplan (VLDB 2008) used similar methods for their bottom-k sketches. ...
Accelerating weighted random sampling without replacement
Random sampling from discrete populations is one of the basic primitives in statistical computing. This article briefly introduces weighted and unweighted sampling with and without replacement. The case of weighted sampling without replacement appears to be most difficult to implement efficiently, which might be one reason why the R implementation performs slowly for large problem sizes. This p...
Journal
Journal title: International Journal of Mathematical Education in Science and Technology
Year: 2007
ISSN: 0020-739X, 1464-5211
DOI: 10.1080/00207390701228385